NYU: Description of the MENE Named Entity System as Used in MUC-7
Abstract
This paper describes a new system called "Maximum Entropy Named Entity" or "MENE" (pronounced "meanie"), which was NYU's entrant in the MUC-7 named entity evaluation. By working within the framework of maximum entropy theory and utilizing a flexible object-based architecture, the system is able to make use of an extraordinarily diverse range of knowledge sources in making its tagging decisions. These knowledge sources include capitalization features, lexical features, and features indicating the current type of text (i.e., headline or main body). It makes use of a broad array of dictionaries of useful single or multi-word terms such as first names, company names, and corporate suffixes. These dictionaries required no manual editing and were either downloaded from the web or were simply "obvious" lists entered by hand.
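To make the kinds of knowledge sources listed in the abstract concrete, the following is a minimal sketch of how capitalization, lexical, section, and dictionary cues might be turned into binary features for a maximum entropy tagger. It is not MENE's actual implementation: the function name `token_features`, the feature names, and the tiny dictionaries are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical stand-ins for the dictionaries mentioned in the abstract;
# the real system's lists are not reproduced here.
FIRST_NAMES: Set[str] = {"john", "mary"}
CORPORATE_SUFFIXES: Set[str] = {"inc.", "corp.", "ltd."}


def token_features(tokens: List[str], i: int, section: str) -> Dict[str, bool]:
    """Return the binary features that fire for tokens[i].

    `section` is the current type of text, e.g. "headline" or "main_body".
    """
    tok = tokens[i]
    low = tok.lower()
    candidates = {
        "init_cap": tok[:1].isupper(),          # capitalization feature
        "all_caps": tok.isupper(),              # capitalization feature
        "lex=" + low: True,                     # lexical feature
        "section=" + section: True,             # text-type feature
        "in_first_names": low in FIRST_NAMES,   # dictionary feature
        "in_corp_suffixes": low in CORPORATE_SUFFIXES,  # dictionary feature
    }
    # Keep only the features that actually fire for this token.
    return {name: True for name, on in candidates.items() if on}


tokens = ["John", "Smith", "joined", "Acme", "Corp."]
features = [token_features(tokens, i, "main_body") for i in range(len(tokens))]
```

In a maximum entropy model, each such feature would be paired with a candidate tag and given a learned weight; the sketch covers only the feature extraction step.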
Similar resources
PAYMA: A Tagged Corpus of Persian Named Entities
The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...
Description of the Oki System as Used for MUC-7
This paper describes the Oki Information Extraction system as used for the MUC-7 evaluation [1][2]. The tasks we have conducted are Named Entity, Co-reference, Template Element and Template Relation. Each module is implemented using MT system modules and pattern recognition modules. Our purpose in participating in the MUC-7 evaluation is to evaluate how MT system modules are effective for other application ...
Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition
This paper describes a novel statistical named-entity (i.e. "proper name") recognition system built around a maximum entropy framework. By working within the framework of maximum entropy theory and utilizing a flexible object-based architecture, the system is able to make use of an extraordinarily diverse range of knowledge sources in making its tagging decisions. These knowledge sources include...
IsoQuest Inc.: Description Of The NetOwl (TM) Extractor System As Used For MUC-7
IsoQuest used its commercial software product, NetOwl Extractor, for the MUC-7 Named Entity task. The product consists of a high-speed C engine that analyzes text based on a configuration file containing a pattern rule base and lexicon. IsoQuest used the NameTag Configuration to recognize proper names and other key phrases in text, and mapped the product’s extraction tags to the MUC-7 NE tags. ...
NYU: Description of the Proteus/PET System as Used for MUC-7 ST
Through the history of the MUCs, adapting Information Extraction (IE) systems to a new class of events has continued to be a time-consuming and expensive task. Since MUC-6, the Information Extraction effort at NYU has focused on the problem of portability and customization, especially at the scenario level. To begin to address this problem, we have built a set of tools, which allow the user to ...